Detecting cryptically simple protein sequences using the SIMPLE algorithm

Authors

  • M. Mar Albà
  • Roman A. Laskowski
  • John M. Hancock
Abstract

MOTIVATION: Low-complexity or cryptically simple sequences are widespread in proteins, but their evolution and function are poorly understood. To date, methods for detecting low complexity in proteins have been directed towards filtering such regions prior to sequence homology searches, not towards analysing the regions per se. However, many of these regions are encoded by non-repetitive DNA sequences and may therefore result from selection acting on protein structure and/or function.

RESULTS: We have developed a new tool, based on the SIMPLE algorithm, that facilitates quantification of the amount of simple sequence in proteins and determines the type of short motifs that show clustering above a certain threshold. By modifying the sensitivity of the program, simple sequence content can be studied at various levels, from highly organised tandem structures to complex combinations of repeats. We compare the relative amount of simplicity in different functional groups of yeast proteins and determine the level of clustering of the different amino acids in these proteins.

AVAILABILITY: The program is available on request or online at http://www.biochem.ucl.ac.uk/bsm/SIMPLE.
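
The idea of scoring motif clustering and comparing it against a randomised baseline can be illustrated with a small sketch. This is not the published SIMPLE program: the function names, the window size, the shuffle count, and the threshold are all assumptions made for illustration, and the scoring rule (count recurrences of a short motif within a fixed window, then divide by the mean score of composition-preserving shuffles) is a simplified stand-in for whatever the tool does internally.

```python
import random


def simplicity_scores(seq, motif_len=1, window=10):
    """Hypothetical per-position score: number of times the motif starting
    at position i recurs at start positions within `window` on either side."""
    last_start = len(seq) - motif_len
    scores = []
    for i in range(last_start + 1):
        motif = seq[i:i + motif_len]
        lo = max(0, i - window)
        hi = min(last_start, i + window)
        count = sum(
            1
            for j in range(lo, hi + 1)
            if j != i and seq[j:j + motif_len] == motif
        )
        scores.append(count)
    return scores


def relative_simplicity(seq, motif_len=1, window=10, shuffles=20, seed=0):
    """Observed total score divided by the mean total score of shuffled
    sequences with the same composition (an assumed baseline)."""
    rng = random.Random(seed)
    observed = sum(simplicity_scores(seq, motif_len, window))
    residues = list(seq)
    totals = []
    for _ in range(shuffles):
        rng.shuffle(residues)
        totals.append(sum(simplicity_scores("".join(residues), motif_len, window)))
    expected = sum(totals) / len(totals)
    if expected == 0:
        return float("inf") if observed else 1.0
    return observed / expected


def clustered_motifs(seq, motif_len=1, window=10, threshold=5):
    """Motifs whose score reaches `threshold` at one or more positions."""
    scores = simplicity_scores(seq, motif_len, window)
    return sorted({seq[i:i + motif_len] for i, s in enumerate(scores) if s >= threshold})


if __name__ == "__main__":
    # Toy sequence with a polyglutamine tract (made up for this example).
    toy = "MKT" + "Q" * 12 + "LVERAHDG"
    print(clustered_motifs(toy))
    print(relative_simplicity(toy))
```

Lowering `threshold` or widening `window` in this sketch makes it more permissive, which loosely mirrors the abstract's point that changing sensitivity shifts the focus from tight tandem structures to looser combinations of repeats.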

Similar resources

Detection and Discrimination of Theileria annulata and Theileria lestoquardi by using a Single PCR

  The aim of this study was to detect and differentiate Theileria annulata and T. lestoquardi (hirci) by PCR. Members of the genus Theileria are tick-borne hemoprotozoan parasites that cause fatal and enervating diseases of cattle and sheep in Iran. In order to develop a specific method for the detection and identification of Theileria species, specific primers from the surface protein (SP) seque...

Molecular diversity within and between Ajowan (Carum copticum L.) populations based on inter simple sequence repeat (ISSR) markers

The study of genetic relationships is a prerequisite for plant breeding activities as well as for conservation of genetic resources. In the present study, genetic diversity among and within 15 Iranian native Ajowan (Carum copticum L.) populations was determined using inter simple sequence repeat (ISSR) markers. Twelve selected primers produced 153 discernible bands, with 93 (60.78%) being ...

Comparing the Bidirectional Baum-Welch Algorithm and the Baum-Welch Algorithm on Regular Lattice

A profile hidden Markov model (PHMM) is widely used in assigning protein sequences to protein families. In this model, the hidden states only depend on the previous hidden state and observations are independent given hidden states. In other words, in the PHMM, only the information of the left side of a hidden state is considered. However, it makes sense that considering the information of the b...

Mining Biological Repetitive Sequences Using Support Vector Machines and Fuzzy SVM

Structural repetitive subsequences are the most important portion of biological sequences and play crucial roles in the corresponding sequence’s fold and functionality. The biggest class of repetitive subsequences is “Transposable Elements”, which has its own sub-classes depending on the contexts’ structures. Much research has been performed to critically determine the structure and function of repetitiv...

Journal:
  • Bioinformatics

Volume 18, Issue 5

Pages: -

Publication date: 2002